The dual-domain strip attention mechanism is a new method that significantly improves performance on image restoration tasks.
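To make the idea concrete, here is a minimal PyTorch sketch of strip attention, where self-attention is restricted to one-pixel-high rows or one-pixel-wide columns. This illustrates the general strip-attention pattern only, not the paper's specific dual-domain design; the module and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class StripAttention(nn.Module):
    """Self-attention restricted to horizontal or vertical strips."""
    def __init__(self, channels: int, horizontal: bool = True):
        super().__init__()
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.horizontal = horizontal
        self.scale = channels ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)
        if self.horizontal:
            # Treat each row as an independent sequence of length w.
            q, k, v = (t.permute(0, 2, 3, 1).reshape(b * h, w, c) for t in (q, k, v))
        else:
            # Treat each column as an independent sequence of length h.
            q, k, v = (t.permute(0, 3, 2, 1).reshape(b * w, h, c) for t in (q, k, v))
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        out = attn @ v
        if self.horizontal:
            out = out.reshape(b, h, w, c).permute(0, 3, 1, 2)
        else:
            out = out.reshape(b, w, h, c).permute(0, 3, 2, 1)
        return self.proj(out)

x = torch.randn(1, 32, 64, 64)
y = StripAttention(32, horizontal=True)(x)  # -> (1, 32, 64, 64)
```

Restricting attention to strips keeps the cost linear in one spatial dimension per pass, which is why strip designs scale better than full 2-D attention on high-resolution restoration inputs.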
KnowAgent introduces a novel strategy for boosting large language models' planning skills by integrating explicit action knowledge. The approach helps LLMs perform better in complex tasks by guiding them through more logical planning trajectories.
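One simple way to picture "explicit action knowledge" is a table of legal action transitions that a plan must obey. The sketch below is a toy illustration in that spirit; the rule set and function names are hypothetical, not KnowAgent's implementation.

```python
# Hypothetical action-transition rules: each action lists the actions
# that may legally follow it. The same table could be used to filter an
# LLM's proposed next step during plan generation.
ACTION_RULES = {
    "Start":    {"Search", "Retrieve"},
    "Search":   {"Search", "Retrieve", "Lookup", "Finish"},
    "Retrieve": {"Search", "Lookup", "Finish"},
    "Lookup":   {"Search", "Finish"},
}

def check_plan(plan):
    """Reject trajectories that violate the action-transition knowledge."""
    state = "Start"
    for step in plan:
        if step not in ACTION_RULES.get(state, set()):
            return False, f"illegal transition {state} -> {step}"
        state = step
    return True, "ok"

print(check_plan(["Search", "Retrieve", "Lookup", "Finish"]))  # (True, 'ok')
print(check_plan(["Lookup", "Finish"]))  # illegal: Start -> Lookup
```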
Microsoft's DeepSpeed library has an update that allows models to use just 6 bits per parameter, which can speed up inference by well over 2x.
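To show where the savings come from, here is a toy sketch of symmetric 6-bit integer quantization of a weight matrix. This is not DeepSpeed's actual format or kernels (its 6-bit support is a custom floating-point layout with fused GPU kernels); it only demonstrates the basic quantize/dequantize arithmetic.

```python
import torch

def quantize_6bit(w: torch.Tensor):
    """Map float weights to signed 6-bit integers in [-31, 31]."""
    scale = w.abs().max() / 31.0
    q = torch.clamp(torch.round(w / scale), -31, 31).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4096, 4096)
q, scale = quantize_6bit(w)
err = (dequantize(q, scale) - w).abs().mean()
print(f"mean abs error: {err:.5f}")
# 6 bits/param vs. 16 bits/param is ~2.7x less weight memory, and LLM
# inference is usually memory-bandwidth-bound -- hence the speedup.
```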
ViewFusion is a new algorithm designed to improve how diffusion models generate images from new perspectives, ensuring that the images remain consistent across different views.
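One generic way to encourage cross-view consistency in a view-conditioned diffusion sampler is to blend the noise predictions obtained by conditioning on each available reference view at every denoising step. The sketch below illustrates that idea under stated assumptions; it is a simplification, not ViewFusion's exact algorithm, and the `model` interface is hypothetical.

```python
import torch

def consistent_eps(model, x_t, t, ref_views, weights):
    """Blend noise predictions conditioned on each reference view.

    model(x_t, t, view) is assumed to return a noise estimate shaped like x_t.
    """
    eps = torch.stack([model(x_t, t, v) for v in ref_views])  # (V, *x_t.shape)
    w = torch.tensor(weights, dtype=eps.dtype)
    w = w.view(-1, *([1] * x_t.dim())) / w.sum()              # normalized weights
    return (w * eps).sum(dim=0)

# Toy demo with a dummy "denoiser" standing in for a real diffusion model.
dummy = lambda x, t, v: x * 0.9 + v
x_t = torch.randn(1, 3, 8, 8)
views = [torch.randn(1, 3, 8, 8) for _ in range(3)]
eps = consistent_eps(dummy, x_t, t=10, ref_views=views, weights=[0.5, 0.3, 0.2])
```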
Last week saw a breakthrough in training large models on small GPUs. This config shows how to apply those techniques to train Mixtral on consumer hardware.
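Assuming the config builds on 4-bit quantization plus LoRA adapters (QLoRA), here is a minimal sketch with the Hugging Face stack of how Mixtral can be loaded for memory-efficient fine-tuning; the exact recipe in the referenced config may differ (for example, by adding FSDP for multi-GPU consumer setups).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with 4-bit NF4 weight quantization.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",
    quantization_config=bnb,
    device_map="auto",
)

# Train only small low-rank adapters on the attention projections.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # a tiny fraction of the ~47B params
```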
This project introduces a new method for detecting objects in images captured across different spectra, such as RGB, near-infrared, and thermal, focusing on object-centric information to overcome background noise and improve recognition accuracy.
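A common building block in multispectral detection is a fusion module that weights each modality's features before the detection head. The PyTorch sketch below shows a generic content-dependent gate; it is illustrative only and not the project's specific object-centric method.

```python
import torch
import torch.nn as nn

class ModalityFusion(nn.Module):
    """Weight each modality's feature map by a learned, content-dependent gate."""
    def __init__(self, channels: int, num_modalities: int = 2):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels * num_modalities, num_modalities, kernel_size=1),
            nn.Softmax(dim=1),  # weights across modalities sum to 1
        )

    def forward(self, feats):  # feats: list of (B, C, H, W), one per modality
        stacked = torch.cat(feats, dim=1)   # (B, C*M, H, W)
        w = self.gate(stacked)              # (B, M, 1, 1)
        return sum(w[:, i:i + 1] * f for i, f in enumerate(feats))

rgb = torch.randn(2, 64, 32, 32)
thermal = torch.randn(2, 64, 32, 32)
fused = ModalityFusion(64, num_modalities=2)([rgb, thermal])  # (2, 64, 32, 32)
```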
This project uses large language models to decompile binary code, translating compiled binaries back into readable source code.
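The setup can be framed as a translation task: feed disassembled assembly to a causal LM and sample source code. A minimal sketch follows; the model path is a placeholder, so substitute the project's released checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "path/to/decompile-llm"  # placeholder: use the project's checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

# Disassembly of a trivial multiply function, as the model's input.
asm = """
push  rbp
mov   rbp, rsp
mov   eax, edi
imul  eax, esi
pop   rbp
ret
"""
prompt = f"# Assembly:\n{asm}\n# C source:\n"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```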
GroupContrast redefines self-supervised 3D representation learning by integrating segment grouping with semantic-aware contrastive learning.
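The core idea — treating points in the same segment as positives and other segments as negatives — can be sketched as an InfoNCE-style loss over point embeddings. This shows the loss shape only, not GroupContrast's full pipeline; the function name and toy data are illustrative.

```python
import torch
import torch.nn.functional as F

def segment_contrastive_loss(feats, segment_ids, temperature=0.07):
    """feats: (N, D) point embeddings; segment_ids: (N,) segment labels."""
    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.t() / temperature                 # (N, N) similarities
    eye = torch.eye(len(feats), dtype=torch.bool)
    pos = (segment_ids.unsqueeze(0) == segment_ids.unsqueeze(1)) & ~eye
    logits = sim.masked_fill(eye, float("-inf"))          # exclude self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    return -log_prob[pos].mean()  # pull same-segment points together

feats = torch.randn(8, 16, requires_grad=True)
segs = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
print(segment_contrastive_loss(feats, segs))
```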
Researchers have developed a new two-stage training method for Visual Geo-localization (VG), improving performance in applications such as autonomous driving, augmented reality, and SLAM.
AgentWrite is a method that breaks down lengthy tasks into smaller parts, allowing models to produce coherent outputs exceeding 20,000 words.
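The divide-and-conquer pattern is easy to sketch: plan an outline first, then generate each section in turn with the running draft as context. The code below illustrates that pattern; `llm` is a hypothetical stand-in for whatever chat-completion call you use, and the prompts are illustrative rather than AgentWrite's exact prompts.

```python
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API call here")

def agent_write(task: str, num_sections: int = 10) -> str:
    # Stage 1: ask the model to plan the piece section by section.
    outline = llm(
        f"Break this writing task into {num_sections} section plans, "
        f"one per line, each with a target word count:\n{task}"
    ).splitlines()

    # Stage 2: write sections sequentially, conditioning on the draft so far.
    draft = []
    for section_plan in outline:
        draft.append(llm(
            "Continue the piece below by writing the next section.\n"
            f"Overall task: {task}\n"
            f"Section plan: {section_plan}\n"
            f"Written so far:\n{''.join(draft)[-4000:]}"  # keep context bounded
        ))
    return "\n\n".join(draft)
```

Because each call only needs to produce one section, the per-call output stays well within the model's generation limit while the concatenated draft can run to tens of thousands of words.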